Condensing biomedical journal texts through paragraph ranking

Authors

  • Jung-Hsien Chiang
  • Heng-Hui Liu
  • Yi-Ting Huang
Abstract

MOTIVATION: The growing availability of full-text scientific articles raises the question of how to digest full-text content most efficiently. Although article titles and abstracts provide accurate and concise information on an article's contents, their brevity inevitably entails a loss of detail. Full-text articles provide those details but take more time to read. The primary goal of this study is to combine the advantages of concise abstracts and detail-rich full texts to ease the burden of reading.

RESULTS: We retrieved abstract-related paragraphs from full-text articles by matching keywords shared between the abstract and the paragraphs of the main text. Significant paragraphs were then recommended by applying the proposed paragraph ranking approach. Finally, the user was given a condensed text consisting of these significant paragraphs, which saves the time needed to peruse the whole article. We compared the proposed approach with a keyword counting approach and a PageRank-like approach. The evaluation covered two aspects: the importance of each retrieved paragraph and the information coverage of a set of retrieved paragraphs. In both evaluations, the proposed approach outperformed the other approaches.

CONTACT: [email protected]
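As a rough illustration of the keyword counting baseline mentioned in the abstract, the sketch below scores each paragraph by the number of keywords it shares with the abstract and returns the highest-scoring ones. It is not the authors' proposed ranking approach, whose details the abstract does not give. The function names `keywords` and `rank_paragraphs`, the small stopword list, the tokenization, and the sample texts are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are",
             "for", "with", "on", "by", "that", "this", "we"}


def keywords(text):
    """Return the set of lowercased word tokens in text, minus stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def rank_paragraphs(abstract, paragraphs, top_k=2):
    """Rank paragraphs by how many keywords they share with the abstract.

    Returns up to top_k (paragraph_index, shared_keyword_count) pairs,
    highest count first; paragraphs sharing no keyword are dropped.
    """
    abstract_kw = keywords(abstract)
    scored = [(i, len(abstract_kw & keywords(p)))
              for i, p in enumerate(paragraphs)]
    related = [(i, score) for i, score in scored if score > 0]
    # Sort by descending score, breaking ties by original paragraph order.
    related.sort(key=lambda pair: (-pair[1], pair[0]))
    return related[:top_k]


# Made-up sample texts, for illustration only.
sample_abstract = "Paragraph ranking condenses biomedical articles."
sample_paragraphs = [
    "We thank the funding agency.",
    "Biomedical articles are long.",
    "Paragraph ranking selects significant paragraphs from biomedical articles.",
]

for index, score in rank_paragraphs(sample_abstract, sample_paragraphs):
    print(index, score)
```

A plain shared-keyword count treats every keyword as equally informative and ignores relations between paragraphs, which is presumably why the abstract also compares against a PageRank-like approach and a dedicated ranking method.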

Similar resources

Incremental Text Structuring with Online Hierarchical Ranking

Many emerging applications require documents to be repeatedly updated. Such documents include newsfeeds, webpages, and shared community resources such as Wikipedia. In this paper we address the task of inserting new information into existing texts. In particular, we wish to determine the best location in a text for a given piece of new information. For this process to succeed, the insertion alg...

Paragraph retrieval for why-question answering Exploiting discourse structure for intelligent paragraph retrieval for why-QA

Finding answers to why-questions involves finding arguments in texts, rather than the noun phrases that are typical targets for factoid questions. Detecting arguments requires detecting specific rhetorical structures and relations. Therefore, we proposed the use of Rhetorical Structure Theory (RST) as a tool for discovering answers to why-questions in paragraphs that are likely to contain the an...

Generative Paragraph Vector

The recently introduced Paragraph Vector is an efficient method for learning high-quality distributed representations for pieces of text. However, an inherent limitation of Paragraph Vector is its inability to infer distributed representations for texts outside of the training set. To tackle this problem, we introduce a Generative Paragraph Vector, which can be viewed as a probabilistic exten...

The Comparative Effect of Using Idioms in Conversation and Paragraph Writing on EFL Learners’ Idiom Learning

This study investigated the comparative effect of teaching idiomatic expressions through practicing them in conversation and paragraph writing on intermediate EFL learners’ idiom learning. The participants were sorted out of a population of 134 intermediate students in Zabansara Language School in Khorramabad based on their scores on a Preliminary English Test (PET) and an idiom test piloted in...

Journal title:
  • Bioinformatics

Volume 27, Issue 8

Publication date: 2011